Abstract

According to the statistical interpretation of quantum theory, quantum computers form a distinguished class of
probabilistic machines (PMs) by encoding n qubits in 2n pbits (random binary variables). This raises the possibility
of large-scale quantum computing using PMs, especially with neural networks, which have an innate capability
for probabilistic information processing. Restricting ourselves to a particular model, we construct and numerically
examine the performance of neural circuits implementing universal quantum gates. A discussion of the physiological
plausibility of the proposed coding scheme is also provided.
1 Introduction

Neural networks are naturally evolved systems for information processing. Despite decades of
experimental and theoretical research, there is no agreement on the information encoding employed
by these circuits - the problem of what exactly is being communicated via seemingly chaotic spike
trains is still largely open [1]. Progress in understanding this neural language is obstructed by
the variety of cell types, working conditions and molecular factors that have to be taken into
account [2]. The generally accepted schemes, the rate code and the phase code, may turn out to be
only the first two in a sequence of progressively more intricate codes, in which higher-order
correlations within cellular complexes are utilized.

Quantum information science, on the other hand, has matured over the last two decades, making
significant contributions to both information theory and quantum mechanics (QM). The latter, having
historical roots in particle physics, is still often identified with the microworld. Yet there is
nothing in the mathematical foundations of QM which could justify that point of view. In fact,
apart from its microscopic realizations, quantum theory has found many avatars, from mechanical [3],
linguistic [4] and purely geometric [5] to statistical [6-9]. In this article the last, widely
accepted interpretation is used to study the feasibility of the hypothesis that spike trains may
actually encode quantum states.

Such a hypothesis appears particularly attractive in that Nature is notorious for repeating itself
at various scales, and if quantum computing (QC) proves to be practical, it would be rather
surprising if one could not find it implemented at a higher level. From this point of view neural
networks are the obvious candidates for such implementations. By examining two neural circuits
designed to perform quantum operations (1-qubit rotations and the 2-qubit CNOT gate), we demonstrate
the feasibility of our hypothesis within the limits of a simple model. Although quantum registers
are realized efficiently with just two neurons per qubit, the major costs
are in the processing of information carried by the spike
trains. The simulations provided are intended to emphasize the amount of these resources as well as
the functionality required for implementation.

We begin with a short review of the formalism which allows for the identification of pairs of
spiking neurons with qubits. In Section 3 a reduced model of a neural network is described, which
in Section 4 is used as a basis for the construction of quantum gates. The results of simulations,
in terms of achieved fidelity and coherence, are promising enough to look toward more realistic
implementations. We touch briefly on these issues in the last section.

2 Manipulation of quantum states embedded in probabilistic space

The operational approach to quantum mechanics, through the formalism of positive operator-valued
measures (POVMs), allows one to express the states of a quantum system defined in a
finite-dimensional Hilbert space $\mathcal{H}$ in terms of probability distributions. If the
dimension is $d := \dim\mathcal{H}$, then a generic density matrix $\varrho$ representing the state
has $d^2 - 1$ degrees of freedom (DOFs). A distribution obtained through a particular POVM has
length $d^2$ and - due to the normalization constraint - the same number of DOFs as the density
matrix [8]. For n-qubit states, this distribution can be associated with the joint probability of
2n binary random variables¹.

1 This is in close analogy to complex numbers, which extend the reals and at the same time are
embeddable in a real vector space of doubled dimension equipped with complex structure.

Let $\varrho$ be a generic density matrix of a 1-qubit state, which, using the summation
convention, we write as

$$\varrho := \varrho^\mu \sigma_\mu,$$

where $\sigma_0 = \mathbb{1}$, $\sigma_{1,2,3}$ are the Pauli matrices, $\varrho^0 = \tfrac12$, and
$\varrho^1, \varrho^2, \varrho^3 \in [-\tfrac12, \tfrac12]$ are the three real coordinates of a
Bloch vector. Let $\{\Lambda^z\}$ be a normalized 4-element positive operator-valued measure

$$\sum_z \Lambda^z = 2\,\mathbb{1}, \qquad z = 0,\dots,3. \tag{1}$$
Typically, one associates such a POVM with the Pauli basis, that is

$$\Lambda^z = \tfrac12\,\sigma^0 + \Lambda^z_i\,\sigma^i, \qquad i = 1,\dots,3,$$

where $\Lambda^z_\mu \in \mathbb{R}$ (so that $\Lambda^z_0 = \tfrac12$) and
$\sigma^\mu := \sigma_\mu^\dagger = \sigma_\mu$ is the basis dual with respect to the scalar product
$\langle \sigma^\mu, \sigma_\nu \rangle := \tfrac12\,\mathrm{tr}[\sigma^\mu \sigma_\nu] = \delta^\mu_\nu$.
Although not a strict necessity, it is reasonable to assume the same POVM for all n qubits within a
register, and consequently take the entire POVM as an n-fold tensor product

$$\Lambda^{z_1 \cdots z_n} := \Lambda^{z_1} \otimes \cdots \otimes \Lambda^{z_n}.$$

This leads to the following distribution

$$p^{z_1 \cdots z_n} := \langle \Lambda^{z_1 \cdots z_n}, \varrho \rangle = 2^{-2n} + \Lambda^{z_1}_{\mu_1} \cdots \Lambda^{z_n}_{\mu_n}\, \varrho^{\mu_1 \cdots \mu_n}, \qquad \sum_{z_1 \cdots z_n} p^{z_1 \cdots z_n} = 1,$$

where the constant $2^{-2n}$ is the contribution of $\mu_1 = \dots = \mu_n = 0$, which is therefore
omitted from the remaining sum. Introducing the event basis $\{e_z\}$, the transformation can
concisely be written as

$$p = \Lambda \varrho, \tag{2}$$

where

$$p = p^{z_1 \cdots z_n}\, e_{z_1} \otimes \cdots \otimes e_{z_n}, \qquad \Lambda = \Lambda^{z_1}_{\mu_1} \cdots \Lambda^{z_n}_{\mu_n}\; e_{z_1} \otimes \cdots \otimes e_{z_n} \otimes \sigma^{\mu_1} \otimes \cdots \otimes \sigma^{\mu_n}.$$

Conversely, if $\{\Lambda^z\}$ are linearly independent, then one can invert the relation (2) and
take the distribution p as an equivalent representation of the quantum state

$$\varrho = \Lambda^{-1} p.$$

A unitary transformation $U \in U(2^n)$ of the state is a linear operator² $L \in SO(2^{2n} - 1)$

$$\varrho \mapsto U^\dagger \varrho\, U = L \varrho,$$

with elements

$$L^{\mu_1 \cdots \mu_n}_{\nu_1 \cdots \nu_n} = \langle \sigma^{\mu_1} \otimes \cdots \otimes \sigma^{\mu_n},\; U^\dagger\, \sigma_{\nu_1} \otimes \cdots \otimes \sigma_{\nu_n}\, U \rangle. \tag{3}$$

2 Note that the embedding allows a wider range of isometries to be implemented, not only the ones
corresponding to unitary operations. For instance, the 1-qubit antipode (unfortunately also called
the quantum universal-NOT) can only be approximated in unitary QM [10, 11]. In the probabilistic
approach one can realize it exactly.

After transformation of the basis $\Lambda : \{\sigma_\mu\} \to \{e_z\}$ one has the same operation
acting on the probability distribution

$$p \mapsto (\Lambda L \Lambda^{-1})\, p. \tag{4}$$

There is, however, an important difference between the linear dynamics of Eq. (4) and Markovian
transitions usually considered in association with stochastic evolution: Denote by $\Omega^{2n}$
the space of joint probability distributions of 2n pbits. Since the operator
$\Lambda L \Lambda^{-1}$ is by definition invertible, it follows that, in general, it is not a
positive one, hence only a subset of $\Omega^{2n}$ will be mapped back into itself. We denote this
subset - the (closure of) positive domain of quantum operators - by

$$\Omega^{2n}_+ := \{\, p \in \Omega^{2n} \;|\; \Lambda L \Lambda^{-1} p \in \Omega^{2n} \,\}.$$

This is simply the image of all quantum states under the POVM $\Lambda$. The boundary
$\Omega^{2n}_0 := \partial\Omega^{2n}_+$, which is the image of the Bloch sphere in $\Omega^{2n}$,
contains pure states, while its interior $\Omega^{2n}_< := \Omega^{2n}_+ \setminus \Omega^{2n}_0$
is the subset of mixed states. All remaining distributions
$\Omega^{2n}_> := \Omega^{2n} \setminus \Omega^{2n}_+$ are mapped by $\Lambda^{-1}$ to the exterior
of the Bloch sphere. Therefore, the POVM partitions the set of possible distributions into three
disjoint subsets:

$\Omega^{2n}_0 = \partial\,\Omega^{2n}_+$ - pure quantum states
$\Omega^{2n}_< = \mathrm{int}\,\Omega^{2n}_+$ - mixed/decohered states
$\Omega^{2n}_> = \mathrm{ext}\,\Omega^{2n}_+$ - overcohered states

To explain the term overcohered used above, let us take a closer look at the limitations imposed by
the POVM on distributions in $\Omega^{2n}_+$. Positivity of $\Lambda$ implies that the
probabilities are bounded by

$$p^{z_1 \cdots z_n} \leq 2\langle p \rangle = 2\,(\Lambda^z_0\, \varrho^0)^n = 2^{1-2n}.$$

Furthermore, if, as we assume, $\Lambda$ is non-degenerate, then for any quantum state only one of
the elements $\{p^{z_1 \cdots z_n}\}$ can either vanish or reach the maximal value
$2\langle p \rangle$. This means that there is a non-zero lower bound on the entropy of
distributions in $\Omega^{2n}_+$, and hence no distribution with a certain outcome can represent a
quantum state. Moreover, all single-pbit marginals are non-vanishing.

A quantitative characterization of the coherence can be given by the radius of the state's Bloch
vector. The metric $g : \Omega^{2n} \times \Omega^{2n} \to \mathbb{R}$ induced by the POVM on the
distribution space permits one to obtain this radius directly for an arbitrary
$p \in \Omega^{2n}$. Let $p_1, p_2 \in \Omega^{2n}_+$ and $\varrho_1, \varrho_2$ be the
corresponding quantum states. Then g is given by

$$g(p_1, p_2) := \mathrm{tr}[\varrho_1 \varrho_2] = (\Lambda^{-1} p_1) \cdot (\Lambda^{-1} p_2).$$

Because this is a bilinear map with coefficients independent of $p_1$, $p_2$, one is free to extend
its domain onto the entire space $\Omega^{2n}$. Since $\varrho^{0 \cdots 0} = 2^{-n}$, the radius is

$$r^2 := 2^{-n}\, g(p, p) - 2^{-2n}.$$

In particular, for a pure state

$$r^2_{\mathrm{pure}} = 2^{-n}(1 - 2^{-n}),$$

and the Bloch radius of any mixed state is always bounded by $r < r_{\mathrm{pure}}$. The ratio

$$R := \frac{r}{r_{\mathrm{pure}}} \tag{5}$$
can be adopted as a measure of coherence - ranging from $R = 0$ for the maximally decohered state
$\varrho = 2^{-n}\mathbb{1}$, through $R = 1$ for any pure $\varrho = \varrho^2$, and beyond
$R > 1$ for all overcohered ones.

In order to quantify the performance of circuits considered later in this article, we will also
employ another, independent measure by which one can estimate the angular disparity between
expected and obtained states. The fidelity, or normalized overlap, between $p, q \in \Omega^{2n}$
is defined here as

$$F := \frac{g(p, q)}{\sqrt{g(p, p)}\,\sqrt{g(q, q)}}.$$

We choose fidelity as a commonly adopted measure, for the purpose of comparison, notwithstanding
the direct estimate of the unitary error between the desired pure state $p \in \Omega^{2n}_0$ and
the obtained distribution $q \in \Omega^{2n}$, which is readily computable:

$$\alpha = \arccos \frac{g(p, q) - 2^{-n}}{r(q)\,\sqrt{2^n - 1}}.$$

The two quantities $\alpha$ and $F$ are nevertheless dependent.
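The measures defined in this section can be illustrated with a minimal Python sketch for a single qubit. Since the POVM actually used in the simulations is not reproduced here, the sketch assumes a tetrahedral POVM, $p_z = \tfrac14 + \tfrac12\, n_z \cdot b$, with unit vectors $n_z$ at the vertices of a regular tetrahedron and Bloch vector $b$ ($|b| \leq \tfrac12$); this choice, and all function names, are assumptions made for illustration only.

```python
import math

# Assumed POVM (hypothetical stand-in, not taken from the source):
# p_z = 1/4 + (1/2) n_z . b, with n_z the unit vertices of a regular tetrahedron.
S2, S3 = math.sqrt(2.0), math.sqrt(3.0)
TETRA = [
    ( S2 / S3, 0.0,  1 / S3),   # z = 00
    (-S2 / S3, 0.0,  1 / S3),   # z = 01
    (0.0,  S2 / S3, -1 / S3),   # z = 10
    (0.0, -S2 / S3, -1 / S3),   # z = 11
]

def distribution(b):
    """Forward map p = Lambda rho for a 1-qubit state with Bloch vector b."""
    return [0.25 + 0.5 * sum(n_k * b_k for n_k, b_k in zip(n, b)) for n in TETRA]

def bloch(p):
    """Inverse map Lambda^{-1}; for this POVM b = (3/2) sum_z p_z n_z."""
    return [1.5 * sum(p_z * n[k] for p_z, n in zip(p, TETRA)) for k in range(3)]

def metric(p, q):
    """g(p, q) = tr[rho_p rho_q] = 1/2 + 2 b_p . b_q for one qubit."""
    bp, bq = bloch(p), bloch(q)
    return 0.5 + 2.0 * sum(x * y for x, y in zip(bp, bq))

def coherence(p):
    """R = r / r_pure, with r^2 = g(p, p)/2 - 1/4 and r_pure = 1/2 for n = 1."""
    r2 = 0.5 * metric(p, p) - 0.25
    return math.sqrt(max(r2, 0.0)) / 0.5

def fidelity(p, q):
    """Normalized overlap F = g(p, q) / sqrt(g(p, p) g(q, q))."""
    return metric(p, q) / math.sqrt(metric(p, p) * metric(q, q))

def unitary_error(p_pure, q):
    """alpha = arccos[(g(p, q) - 1/2) / r(q)] for n = 1; requires r(q) > 0."""
    r_q = 0.5 * coherence(q)
    c = (metric(p_pure, q) - 0.5) / r_q
    return math.acos(max(-1.0, min(1.0, c)))
```

With this assumed POVM, a pure state gives R = 1, the uniform distribution gives R = 0, and a distribution with a certain outcome such as [1, 0, 0, 0] gives R > 1, that is, an overcohered distribution that no quantum state can produce.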

3 A toy-model neural-network 

The information in neural networks is carried by spike 
trains, which after appropriate discretization can be 
transformed to binary strings. The model network de- 
scribed in this section is a much simplified version of 
what usually is considered realistic - the purpose of such 
reduction is to retain only the essential features. Con- 
sistently with discretization of transmitted signals, the 
model operates in explicitly step-wise manner, instead 
of continuous-time evolution. Likewise, the delays ef- 
fected along the inter-neuron paths are also taken to be 
integers. 

Let $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ be a multiply-connected digraph, where
$\mathcal{V} = \{v_i\}$ is the vertex basis of neurons (we shall also write $\mathcal{V}_N$ to make
explicit the number N of vertices involved) and
$\mathcal{E} = \mathcal{V} \otimes \mathcal{V} \otimes \mathbb{N}$ is the basis of edges, that is
the possible synaptic connections. The actual couplings between the i-th and j-th neuron are set by
the weights $W^i_{js}$, where $s \in \mathbb{N}$ enumerates the delays introduced along multiple
edges. For each vertex we define two variables: the binary output state $X^i \in \{0, 1\}$ and the
residual potential $u^i \in \mathbb{R}$.

We adopt the discrete integrate-and-fire scheme for the dynamics of this network. In each time step
the potential is first updated by accumulating the incoming signals

$$u^{i,t-1} \mapsto u_+^{it} := u^{i,t-1} + W^i_{js}\, X^{j,t-s},$$

where the summation runs over connected vertices (j) and edge delays ($s \geq 1$). Subsequent spike
generation ($X^{it} = 1$) occurs with probability $P(u_+^{it})$, where
$P : \mathbb{R} \to [0, 1]$ is a 'noisy' activation function with firing threshold fixed at
$u_{\mathrm{thr}} = \tfrac12$. Its actual form used in simulations is given by

$$P(u) = \frac12 \left[ 1 + \mathrm{erf}\!\left( \frac{u - u_{\mathrm{thr}}}{\sqrt{2}\,\sigma} \right) \right],$$

where $\sigma \geq 0$ is a global control parameter characterizing the noise standard deviation
(SD). In particular, in the limit $\sigma \to 0$ the spikes are produced deterministically as P
becomes a step-function. The excited state $u_+$ is eventually reduced by the release of a spike
(refractory potential), and further quenched with a bounded, nonlinear map S.

We assume S to have an attractive fixed point at the origin (the resting potential),
$\forall u : \lim_{t \to \infty} S^t(u) = 0$, to be linear in its neighborhood, $S'(0) = 1$, and to
have finite but non-zero asymptotes, $0 < |S(\pm\infty)| < \infty$. The motivation for the
introduction of this mapping is twofold: First, the physiological mechanisms of signal transmission
imply the existence of saturations in both positive and negative direction. The cell can be
depolarized or hyperpolarized through synaptic channels only to a certain extent, and adding more
excitatory or inhibitory connections will not have a significant effect. Second, the reason to have
$u = 0$ for an attractive fixed point is to imitate the 'leaky' integration scenario, by which in
the absence of input the potential returns back to its resting point. In the simulations this
function was taken to be a simple, skew-symmetric mapping

$$S(u) := \gamma \tanh\frac{u}{\gamma}, \qquad \gamma \geq 0.$$

Here, the asymptotes are $S(\pm\infty) = \pm\gamma$, therefore we call $\gamma$ the 'saturation
parameter'. If we assume that the neuron is left without input and with some residual u, so that no
spikes are generated, then the potential u will decay sub-exponentially in time, as

$$u \mapsto S^t(u),$$

where $S^t$ means t-fold composition. In the limit $\gamma \to 0$, the residual potential is reset
to zero after each cycle, and this situation can be associated with time steps longer than the
total refractory time (~ 20 ms), within which the cell relaxes to its resting point. If
$\gamma > 0$, then the probability of a consecutive spike is modified by the residual potential:
The cell is within the relative refractory period, when the potassium channels are still open, but
the sodium gates have already reverted to their normal state. This mode corresponds to time steps
of order $\lesssim 5$ ms. Shorter times are generally unrealistic due to high suppression of spike
generation during the absolute refractory period, when the sodium channels are closed.

The choice of a specific value of $\gamma$ is therefore indirectly related to the time scale, and
consequently to the discretization window of action potentials. If this window is too short, the
discretization becomes ambiguous and the model breaks down - this is another reason not to consider
high saturation values.

The qualitative behavior of the above model is best understood by analyzing a single neuron at the
limits of the two control parameters $\sigma$ and $\gamma$. Assume the cell is fed with a stimulus
at a constant frequency $\nu_{\mathrm{in}} \in [0, 1]$, and consider at first the noiseless regime
$\sigma = 0$. If $\gamma = 0$, then the only memory of past input values is stored in delayed
connections. The cell fires only if the value of the convolution $W_{js} X^{j,t-s}$ exceeds the
threshold $u_{\mathrm{thr}}$. Such a neuron acts like a high-pass filter and its firing rate is
$\nu_{\mathrm{out}} = P(\nu_{\mathrm{in}} \sum_{s,j} W_{js})$. By increasing the noise SD $\sigma$,
the shape of this filtering function changes along with the spiking probability P; nevertheless it
never becomes close to an ideal multiplier - the response is always nonlinear.

If $\gamma \to \infty$ the cell accumulates and 'remembers' the residual value of the convolution
left over after subtraction of generated spikes. This makes it into a perfect multiplier with spike
rate $\nu_{\mathrm{out}} = \nu_{\mathrm{in}} \sum_{s,j} W_{js}$. Raising $\sigma$ above zero does
not change this average response, but the determinism initially apparent in the spike patterns is
gradually washed away.

In between these two regimes lies a surprisingly complex area of fractal-spaced frequency
thresholds and output patterns, particularly conspicuous at $\sigma = 0$ and
$\nu_{\mathrm{in}} = 1$. The presence of these features, found in many nonlinear deterministic
systems, does not critically depend on the specific shape of the function S.
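The two limiting regimes of a single neuron can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the activation function is taken as a Gaussian cumulative distribution centred at the threshold 1/2, the potential after a step is assumed to be $S(u_+ - X)$ (the released spike subtracts one unit before quenching), and the neuron is driven through a single edge by a constant input with $\nu_{\mathrm{in}} = 1$. The function names are hypothetical.

```python
import math
import random

U_THR = 0.5  # firing threshold u_thr

def spike_probability(u, sigma):
    """Noisy activation P(u): Gaussian CDF centred at the threshold (step for sigma = 0)."""
    if sigma == 0.0:
        return 1.0 if u > U_THR else 0.0
    return 0.5 * (1.0 + math.erf((u - U_THR) / (math.sqrt(2.0) * sigma)))

def quench(u, gamma):
    """Saturating map S(u) = gamma * tanh(u / gamma); gamma = 0 resets to rest."""
    if gamma == 0.0:
        return 0.0
    return gamma * math.tanh(u / gamma)

def firing_rate(weight, gamma, sigma, steps=10_000, seed=0):
    """Drive one neuron with a constant spike input (nu_in = 1) through one edge."""
    rng = random.Random(seed)
    u, spikes = 0.0, 0
    for _ in range(steps):
        u_plus = u + weight                       # accumulate the incoming signal
        x = 1 if rng.random() < spike_probability(u_plus, sigma) else 0
        u = quench(u_plus - x, gamma)             # assumed update: subtract spike, then quench
        spikes += x
    return spikes / steps

def residual_after(u0, gamma, steps):
    """Free decay of the residual potential: the t-fold composition S^t(u0)."""
    u = u0
    for _ in range(steps):
        u = quench(u, gamma)
    return u
```

For a weight of 0.3, the call `firing_rate(0.3, 0.0, 0.0)` never fires (the high-pass regime), while a very large saturation parameter such as `gamma=1e6` yields a rate close to 0.3 (the multiplier regime). `residual_after(1.0, 1.0, 1000)` remains of order 1/sqrt(t) rather than decaying exponentially, in line with the sub-exponential decay stated for S.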



4 Implementation of universal quantum 
gates 

According to the discussion provided in Section 2, one needs 2n random binary variables to
implement an n-qubit register. In our model of the neural network, these variables are identified
with discretized spikes registered at 2n network sites. The question we set out to address in this
section is whether there are circuits which can implement state-independent rotations of the joint
probability distributions, that is - quantum gates.

The set of gates universal for quantum computation [12] includes the whole algebra of 1-qubit
rotations and an arbitrary 2-qubit entangling gate, typically chosen to be the CNOT
(controlled-NOT). Although probabilistic encoding of qubits is efficient (i.e. linear in n),
manipulation of their $2^{2n}$ degrees of freedom (DOFs) by definition requires an exponential
amount of resources. From this perspective the construction of circuits described below should
appear at least conceptually straightforward: The space of binary functions over the vertices is
$\mathcal{V}^{2n}$. We first embed an element $X^t = \{X^{it}\} \in \mathcal{V}^{2n}$ into
$\Omega^{2n}$, then apply the gate $G := \Lambda L \Lambda^{-1}$, and finally project the result
back onto $\mathcal{V}^{2n}$. The entire quantum gate transforming one set of spike trains $X^t$ to
another $Y^t \in \mathcal{V}^{2n}$ is then a composition

$$X^t \xmapsto{\;\Pi \circ G \circ \Pi^{-1}\;} Y^t,$$

where $\Pi^{-1} : \mathcal{V}^{2n} \to \Omega^{2n}$, and
$\Pi \circ \Pi^{-1} = \mathrm{id}_{\mathcal{V}^{2n}}$. The main problem in this approach is to
construct a reliable projection $\Pi$, since any information loss during that operation will affect
the quality of the entire gate.

A concrete realization also requires deciding upon the particular POVM being used. It is possible
to choose this transformation in such a way that some of the gates will be significantly
simplified, for instance acquiring the convenient form of permutations. Our choice is dictated by
the optimization of the CNOT gate, discussed later in this section. This POVM is given by Eq. (6)³.

3 From the point of view of state estimation, the optimal POVM is a conformal transformation which
maps the Bloch sphere into a sphere inscribed in the standard simplex of $\mathbb{R}^4$. Thanks to
the many symmetries of such a geometric configuration, some of the rotations are expressible as
permutations of the simplex vertices and can be implemented with high efficiency. In the case of
$\Lambda$ given by Eq. (6), the permutation (00, 10)(01, 11) corresponds to the 1-qubit NOT gate.
With a different POVM one can bring the Hadamard gate H to a permutation; therefore, if an
algorithm relies on frequent applications of this operation, that could be a preferred choice.

4.1 Single-qubit gates

The neural circuit implementing an arbitrary 1-qubit gate is presented in Fig. 1. The projection
$\Pi$, which transforms the 'sparse' code $\{X^{00}, X^{01}, X^{10}, X^{11}\} \in \Omega^2$ onto a
'dense' one $\{X^A, X^B\} \in \mathcal{V}^2$, is a linear mapping implemented with weights
$W_\Pi$. But its inverse, $\Pi^{-1}$, is nonlinear, and we realize this function in a two-step
linear-feedback operation. The first step requires, apart from the input signals, an additional
supply of a constant 'current' of units from the vertex $v_1$. The effect of such a coupling to
unity on a cell is to alter its firing threshold. The weights of this part, effecting a linear
injection from $\{X^A, X^B, 1\}$ to $\{X^{00}, X^{01}, X^{10}, X^{11}\}$, are

$$W_{\Pi^{-1}} = \begin{pmatrix} -1 & -1 & 1 \\ -1 & 1 & 0 \\ 1 & -1 & 0 \\ 1 & 1 & -1 \end{pmatrix}.$$

While the composition $W_\Pi W_{\Pi^{-1}} = \mathrm{id}$ as required, the reciprocal is not an
identity and needs a rectifying feedback sent from the 'winning' neuron to its neighbors, in order
to bring their residual potentials back to zero. Because of the one-step delay, this signal has to
be adjusted to match the attenuation already done by the function S. Hence the weight matrix of
this rectification step is determined by

$$[W_{\mathrm{rec}}]^i_j = -S(-1)\,[\mathrm{id} - W_{\Pi^{-1}} W_\Pi]^i_j. \tag{7}$$

Note that for a vanishing saturation parameter this correction also disappears, due to $S = 0$.

In the absence of noise ($\sigma = 0$), the conversion $\Pi^{-1}$ between dense and sparse coding
is completely error-free. As $\sigma$ increases, the imperfections start to appear in the form of
either multiple or 'void' spiking in the first hidden layer. Although we found the circuit to
behave stably in these conditions, an improvement, in terms of both fidelity and coherence, can be
achieved by adding a second, normalizing feedback (not shown explicitly in Fig. 1).

Figure 1 Schematic for the 1-qubit gate. The hidden nodes are drawn in inverted lexicographical
order to avoid excessive entanglement of the graph. Explicit connections of the normalization (nor)
feedback are omitted to improve legibility. Inhibitory connections are marked with (-•), excitatory
with (-«), and those capped with (-|) depend on the applied gate G. The double connections (-o)
consist of inhibition followed by delayed and attenuated excitation. The transient time is
$\tau_{\mathrm{gate}} = 4 + \tau_{\mathrm{avr}}$. Including the normalization feedback and doubled
gate connections ($\tau_{\mathrm{avr}} = 2$), the circuit comprises 10 nodes and 62 edges.

Normalizing feedbacks are commonly proposed for explanation of the observed behavior in cortical
neurons [16, 17]. The main difference between these and our proposal is that while the former are
multiplicative, this one acts additively. Its role is to adjust the residual potentials for the
difference

$$1 - \sum_i X^i, \qquad i = 00, 01, 10, 11.$$

Because we do not know which of the four neurons spiked mistakenly, the normalizing signal is sent
evenly to all of them. Its strength is determined by the average excess of a signal encountered on
a double-spike event:

$$\Big\langle \sum_j [W_{\mathrm{rec}}]^i_j \Big\rangle \propto S(-1).$$

The deficit, which happens upon lack of a single spike, has the same magnitude but opposite sign.
Therefore, the weight matrix of this normalization acts uniformly on the four neurons, with an
additional last column that refers to the unit vertex $v_1$.
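The embedding and rectification steps described above can be checked with a small noiseless Python sketch. The injection weights follow the linear map from $\{X^A, X^B, 1\}$ to the four sparse neurons with threshold 1/2. The projection weights are an assumption made here for illustration ($X^A = X^{10} + X^{11}$, $X^B = X^{01} + X^{11}$), as are all function names.

```python
U_THR = 0.5  # firing threshold

# Rows: sparse neurons 00, 01, 10, 11; columns: X^A, X^B and the unit vertex.
W_EMBED = [
    [-1, -1,  1],
    [-1,  1,  0],
    [ 1, -1,  0],
    [ 1,  1, -1],
]

# Assumed projection weights (hypothetical): X^A = X^10 + X^11, X^B = X^01 + X^11.
W_PROJECT = [
    [0, 0, 1, 1],
    [0, 1, 0, 1],
]

def embed(x_a, x_b):
    """Noiseless first step of Pi^{-1}: potentials and the resulting spikes."""
    inputs = (x_a, x_b, 1)
    potentials = [sum(w * x for w, x in zip(row, inputs)) for row in W_EMBED]
    spikes = [1 if u > U_THR else 0 for u in potentials]
    return potentials, spikes

def project(sparse):
    """Linear projection Pi from the sparse code back to the dense one."""
    return [sum(w * x for w, x in zip(row, sparse)) for row in W_PROJECT]

def reciprocal():
    """Potentials obtained by projecting each one-hot sparse state and re-embedding it."""
    cols = []
    for k in range(4):
        one_hot = [1 if i == k else 0 for i in range(4)]
        x_a, x_b = project(one_hot)
        cols.append(embed(x_a, x_b)[0])
    return [[cols[k][i] for k in range(4)] for i in range(4)]  # matrix[i][k]

def rectification_pattern():
    """id minus the reciprocal: targets of the rectifying signal, up to the factor -S(-1)."""
    rec = reciprocal()
    return [[(1 if i == k else 0) - rec[i][k] for k in range(4)] for i in range(4)]
```

Under these assumptions each dense input produces exactly one spike in the sparse layer, and `rectification_pattern()` is the anti-diagonal permutation: the only neuron left with a residual potential of S(-1) is the one "opposite" to the winner, which is where the rectifying signal of strength -S(-1) has to be sent.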

Making the embedding $\Pi^{-1}$ robust is crucial for achieving a correct projection $\Pi$.
Multiple or void events linearly projected via $W_\Pi$ typically lead to a significant loss of
coherence (although with less impact on fidelity). In order to compensate for these non-exclusive
events, the spiking neuron sends a composite inhibitory signal down the hierarchy. This is
implemented with double connections: the first transmits an inhibitory signal at a certain level
$\eta$, and is followed by a delayed, excitatory one aimed at bringing the residual potential of
the target neuron back to its prior value. The full accomplishment of this goal is impossible with
a nonlinear function S - the value can only be fully restored in the linear limit
$\gamma \to \infty$. Given the inhibitory coupling $\eta$, our best estimate of the following
excitation strength is $\eta' = \langle u \rangle - S(\langle u \rangle - \eta)$, where
$\langle u \rangle$ is the average residual potential. Because $\langle u \rangle \approx 0$, we
set $\eta' = -S(-\eta)$. The optimal value of $\eta$ was found numerically, by minimizing the
variation of fidelity over a range of gates acting on test states (see Results).
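The estimate of the compensating excitation can be illustrated with a brief Python sketch. It assumes the saturating map $S(u) = \gamma \tanh(u/\gamma)$ from Section 3 and, for simplicity, ignores any further quenching after the delayed excitation arrives; the function names are hypothetical.

```python
import math

def quench(u, gamma):
    """Saturating map S(u) = gamma * tanh(u / gamma), for gamma > 0."""
    return gamma * math.tanh(u / gamma)

def compensating_excitation(eta, gamma, mean_u=0.0):
    """eta' = <u> - S(<u> - eta); reduces to -S(-eta) for <u> = 0."""
    return mean_u - quench(mean_u - eta, gamma)

def restored_potential(u, eta, gamma):
    """Potential after inhibition by eta, quenching, and the delayed excitation eta'."""
    return quench(u - eta, gamma) + compensating_excitation(eta, gamma)
```

A target neuron at rest (u = 0) is restored exactly, whereas a neuron with a non-zero residual potential is restored only approximately unless $\gamma$ is large, which reflects the statement that full restoration is possible only in the linear limit.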

Application of a gate G requires no additional node of the network, only manipulation of the
weights between the embedding and projecting parts. In the simplest case these are directly set to

$$W_G = \Lambda L \Lambda^{-1} = G.$$

We have found, however, that within some limits the mechanism of synaptic averaging may provide
improvement of the performance. In real networks, a single synapse contributes only a tiny fraction
of the total input signal [2]. Multiple connections of similar lengths lead to signal accumulation,
different delays - to temporal averaging. In our toy-model, the first case is replaced by single
but strong connections, while for implementation of the latter we directly use several edges having
different lengths with proportionally attenuated couplings. In the case of a single-qubit gate, of
the several configurations tested, the best results were obtained with just a two-step average,
$\tau_{\mathrm{avr}} = 2$; hence the connections were fixed at

$$W_{G,1} = W_{G,2} = \tfrac12\, G.$$



Results. In order to reduce statistical uncertainties, all 
gates were tested on a fixed set of 14 pure states approx- 
imately evenly distributed on the Bloch sphere: 
While the input nodes were fed with spike trains $\{X^A, X^B\}$ of joint probability distributions
$p_{\mathrm{in}} \in \Omega^2_0$ corresponding to the above states, the output $\{Y^A, Y^B\}$ was
tested for its coherence R and its fidelity with the desired distribution $G p_{\mathrm{in}}$. The
initial test runs were made for several gates including Hadamard, NOT, the antipode (non-unitary),
and two rotations: $U_\theta = \exp(i\theta\sigma_2/2)$ and $U_\phi = \exp(i\phi\sigma_3/2)$. As
the representative, the phase gate was selected - the effects of other operations were
quantitatively similar, or better. Its representation $G_\phi = \Lambda L_\phi \Lambda^{-1}$,
acting in $\Omega^2$, reads

$$G_\phi = \frac12 \begin{pmatrix}
1 + \cos\phi & 1 - \cos\phi & -\sin\phi & \sin\phi \\
1 - \cos\phi & 1 + \cos\phi & \sin\phi & -\sin\phi \\
\sin\phi & -\sin\phi & 1 + \cos\phi & 1 - \cos\phi \\
-\sin\phi & \sin\phi & 1 - \cos\phi & 1 + \cos\phi
\end{pmatrix}.$$
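Some properties of the phase-gate matrix can be verified with a few lines of Python. The sketch below is illustrative only: it takes the matrix entries as reconstructed above, assumes the overall factor 1/2 (which is what makes the columns sum to one), and uses hypothetical function names.

```python
import math

def phase_gate(phi):
    """G_phi in the basis (00, 01, 10, 11), with the overall factor 1/2 assumed."""
    c, s = math.cos(phi), math.sin(phi)
    return [
        [0.5 * (1 + c), 0.5 * (1 - c), -0.5 * s,  0.5 * s],
        [0.5 * (1 - c), 0.5 * (1 + c),  0.5 * s, -0.5 * s],
        [ 0.5 * s, -0.5 * s, 0.5 * (1 + c), 0.5 * (1 - c)],
        [-0.5 * s,  0.5 * s, 0.5 * (1 - c), 0.5 * (1 + c)],
    ]

def apply(gate, p):
    """Act with a 4x4 gate on a distribution p over the four events."""
    return [sum(g * x for g, x in zip(row, p)) for row in gate]

def compose(a, b):
    """Matrix product a b of two 4x4 gates."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]
```

In this form the gate conserves total probability, composes additively in the angle, and contains negative entries for generic $\phi$, so it is not a Markov transition matrix. At $\phi = 0$ and $\phi = \pi$ it reduces to a permutation of the four events.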



The results presented in Fig. 2 are averages over 36 rotation angles evenly spaced across the
entire interval $[0, 2\pi)$. The best performance was observed for $\phi = 0$ (identity) and
$\phi = \pi$, while the worst cases were encountered around $\phi \approx \pm\pi/2$ (but not
exactly at these angles). For each setting $(\sigma, \gamma)$, the inhibition level $\eta$ was
adjusted to minimize the variance of fidelity across the test states and rotation angles (cf.
Fig. 2, insets). Note that while this optimization was mainly coincident with maximization of the
fidelity itself, the trend in coherence was typically opposite. Had we chosen to optimize for
purity of states ($R \to 1$), the figures would look different.

The prominent feature of Fig. 2a is the overcoherence of output states in the limit
$\gamma \to 0$. This means these distributions are too sharp to represent quantum states, and any
subsequent application of another gate would certainly lead to a loss of accuracy. Interestingly,
the average fidelity remains at a relatively high level. This suggests a possibility of correcting
the distributions by rescaling about the average. On the other hand, the fidelity SD is significant
for small saturations, and becomes comparable with statistical uncertainties only above
$\gamma > 1$.

The conclusion drawn from Fig. 2b is clear: the circuit considered here is designed to work in the
deterministic regime $\sigma \to 0$. This makes an interesting contrast between the stochastic
nature of quantum states and the determinism of gates acting on them. As we are going to show, this
dichotomy is not limited to the 1-qubit gate, but persists also in the case of the entangling
operation CNOT.

Finally, we have sought an estimate of the time needed to complete the quantum rotations with this
gate. Apart from the spatial resources, measured in terms of cells and connections being used, time
is an important factor contributing to the overall cost of the realization. To assess this
property, we have run the circuit while varying the signal length $\tau_{\mathrm{sig}}$: After an
initial transient of $\tau_{\mathrm{gate}} = 4 + \tau_{\mathrm{avr}}$, the network was run for
$\tau_{\mathrm{sig}} \geq 1$ successive steps, after which the cells were reset to their initial
state ($u^i = 0$, $X^i = 0$), ensuring that all memory traces stored in residual potentials were
erased. This procedure was repeated until satisfactory statistics
($N\tau_{\mathrm{sig}} \approx 10^4$) was gathered.

Figure 2 Performance of the 1-qubit phase gate as a function of a): saturation level of the
residual potential, b): noise SD of the activation function P, c): length of the input signal
(note the scale difference between graphs). Synaptic averaging was fixed at
$\tau_{\mathrm{avr}} = 2$. Each point is a mean over 36 rotation angles within the whole interval
$[0, 2\pi)$, and 14 pure states on which the phase gate was tested. With statistics of $10^4$
steps per setting, the associated uncertainties are negligible - shaded regions (F) and broken
lines (R) represent the standard deviations across the states and rotation angles. Insets: The
inhibition levels used during simulations, optimized for minimization of the fidelity variance.

The results provided in Fig. 2c evidently show that the real temporal cost is not only the delay
$\tau_{\mathrm{gate}}$, but a significant number of further steps are needed to 'tune' this gate to
a signal. After approximately $\tau_{\mathrm{sig}} \approx 30$ events the output quality no longer
improves, and consequently one can identify $\tau_{\mathrm{sig}}$ with the statistics needed for
maximal efficiency. Since the latter is a function of saturation $\gamma$ and noise $\sigma$, one
expects $\tau_{\mathrm{sig}}$ to rise monotonically with $\gamma$ and decrease as $\sigma$
increases. In particular, the ideal case $\gamma \to \infty$, $\sigma \to 0$ would also require
infinite statistics to achieve the best performance. One therefore finds yet another reason for the
low saturation values: The finiteness of signals encoded in spike trains limits the attainable
efficiency of transformations, and high saturation values cannot provide improvement beyond these
limitations.

4.2 The CNOT gate 

Unlike the single-qubit gates, which can, by means of a special choice of the POVM, be transformed
to a permutation, the CNOT operation does not admit such a representation⁴. With $\Lambda$ given by
(6), the gate is represented by the operator
$G_{\mathrm{CNOT}} = \Lambda L_{\mathrm{CNOT}} \Lambda^{-1}$.

4 Were it possible, then either the gate would have no entangling capability, or the marginal
probabilities of the two qubits would not be conserved.

As already mentioned, the POVM (6) has been chosen to optimize the CNOT gate. Indeed, the linear
projection $\Pi : \Omega^4 \to \mathcal{V}^4$,

$$W_\Pi = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1
\end{pmatrix},$$

when combined with $G_{\mathrm{CNOT}}$, shows that the two pbits $X^A$ and $X^B$ are invariant
under $G_{\mathrm{CNOT}}$. These can be directly copied to the output $Y^A$, $Y^B$, as shown in
Fig. 3. For that same reason, while implementing the embedding part $\Pi^{-1}$, we pair
$\{X^A, X^D\}$, $\{X^B, X^C\}$, rather than using the natural order $\{X^A, X^B\}$,
$\{X^C, X^D\}$. The section $\Pi^{-1}$ is a two-stage procedure: First, with the same construction
as in the 1-qubit case, we separately embed the two marginals $\{X^A, X^D\}$ and $\{X^B, X^C\}$:

$$\Pi_I^{-1} : \mathcal{V}^2 \times \mathcal{V}^2 \to \Omega^2 \times \Omega^2.$$

Next, we combine these into a single map

$$\Pi_{II}^{-1} : \Omega^2 \times \Omega^2 \to \Omega^4.$$

Since this is done with a linear mapping, there is again a rectifying feedback
$W_{\mathrm{rec,II}}$ obtainable from Eq. (7) applied to $\Pi_{II}$ on $\Omega^4$. In contrast to
$\Pi_I^{-1}$, the second-stage embedding $\Pi_{II}^{-1}$ turns out to be unstable against noise,
and the normalization feedback is now a necessity. Because of

$$\forall i : \quad 2^{-4} \sum_j [W_{\mathrm{rec,II}}]^i_j \propto S(-1),$$

the normalization weights are set to

$$W_{\mathrm{nor,II}} = -9\,S(-1)\,(-E_4 \oplus \mathbb{1}),$$

where $[E_4]^i_j = 2^{-4}$ is the diffusion operator, and '$\mathbb{1}$' refers to the unit vertex
$v_1$.
4 Would it be possible, then either the gate could have no entangling capability, or the marginal probabilities of the two qubits be not con- 
served. 



Figure 3 Schematic of the CNOT gate. The circuit has 38 nodes and 1309 edges if the synaptic
averaging is set to $\tau_{\mathrm{avr}} = 4$. The transient time is
$\tau_{\mathrm{gate}} = 5 + \tau_{\mathrm{avr}}$.

Thanks to the invariance of two pbits, $X^A = Y^A$ and $X^B = Y^B$, the hierarchical projection
$\Pi$ has been significantly simplified (in comparison to what is needed for a general 2-qubit
gate). The four partial projections from $G_{\mathrm{CNOT}}$, shown on the right hand side of
Fig. 3, are modulated directly by the marginal $\Pi_I^{-1}(X^A, X^D)$. The mechanism of this
modulation is the same as explained before (inhibition followed by attenuated excitation) and the
same parameter $\eta$ is set common on those connections.

Finally, the gate edges were multiplied in order to use the synaptic averaging mechanism. We found
no dramatic improvement while varying the averaging length $\tau_{\mathrm{avr}}$ at $\gamma > 1$;
nevertheless the performance was significantly better for small values of the saturation
parameter. At $\gamma = 1$ the optimal length was $\tau_{\mathrm{avr}} = 4$.

Results. The performance was assessed on a testing set of 28 pure states, which included both separable and entangled ones:

$$
\Big\{\, |00\rangle,\ |01\rangle,\ |10\rangle,\ |11\rangle,\
\tfrac{1}{\sqrt{2}}\big(|00\rangle + e^{i\pi k/2}|01\rangle\big),\
\tfrac{1}{\sqrt{2}}\big(|00\rangle + e^{i\pi k/2}|10\rangle\big),\
\tfrac{1}{\sqrt{2}}\big(|00\rangle + e^{i\pi k/2}|11\rangle\big),\
\tfrac{1}{\sqrt{2}}\big(|01\rangle + e^{i\pi k/2}|10\rangle\big),\
\tfrac{1}{\sqrt{2}}\big(|01\rangle + e^{i\pi k/2}|11\rangle\big),\
\tfrac{1}{\sqrt{2}}\big(|10\rangle + e^{i\pi k/2}|11\rangle\big) \,\Big\},
\qquad k \in \mathbb{Z}_4 .
$$

Interestingly, although some of these states are 'preferred' in terms of the achieved fidelity $F$, there was no correlation between this measure and the entanglement property. This observation should not be surprising, because the mapping of 2-qubit states into joint probability distributions does not make entangled states distinct. It follows that even an imperfect gate implementation should not distinguish these states from separable ones.

The results of simulations are presented in Fig. 4. As before, for each setting of the control parameters $(\gamma, \sigma)$, the inhibition level $\eta$ was adjusted to minimize the variance of fidelity across the test states.

Qualitatively, the plots in Fig. 4 are largely similar to what had been obtained for 1-qubit gates (Fig. 2); the major difference is in the range of achieved fidelities and output coherences. Standard deviations of the coherence $R$ increased evenly by approximately a factor of 2, while the fidelity SD grew by a factor of about 4 to 5. The most dramatic changes are observed in Fig. 4a: whereas at low saturation values ($\gamma < 1$) the 1-qubit gate worked relatively well, in the case of CNOT a huge overcoherence takes place along with a significant fidelity loss.

For a reasonable performance at $\sigma = 0$ and $\tau_{\mathrm{sig}} > 30$ one needs $\gamma > 1$. In this regime one finds $F > 0.97$ $(-0.03, +0.02)$, corresponding to the unitary error $\alpha \lesssim 14^\circ$ $(-5, +4)$; with noise at $\sigma = 0.3$ and $\gamma = 1$ the fidelity drops to $F = 0.77$ $(-0.11, +0.15)$, or $\alpha = 42^\circ$ $(-22, +12)$, which is hardly acceptable for large-scale quantum computation. When comparing these values with the best experimental achievements to date ($F \sim 0.7$–$0.8$ with trapped ions [13], $F \sim 0.6$–$0.8$ with Josephson junctions [14], $F \sim 0.85$ in an optical setup [15]), one has to take into account the many simplifications of our toy model. More realistic simulations or, ultimately, realizations may not necessarily prove as good as this one, and would probably require additional resources to implement some of the error correcting schemes; nevertheless, the principle of quantum computing with neural networks has been demonstrated.



Figure 4 Performance of the CNOT gate for synaptic averaging fixed at $\tau_{\mathrm{avr}} = 4$. Panels: (a) versus saturation $\gamma$; (b) versus noise SD $\sigma$ at $\gamma = 1$; (c) versus signal length $\tau_{\mathrm{sig}}$. Note the differences in ranges when comparing with Fig. 2. The statistics for each setting is $10^4$ steps, and each point is an average over 28 test states.
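
As an illustration of the test set used in the Results, the following minimal Python sketch enumerates the 28 states (four basis states plus six pairwise superpositions with phases $e^{i\pi k/2}$, $k \in \mathbb{Z}_4$), applies an ideal CNOT to obtain reference outputs, and classifies the states by entanglement. The function names, the choice of the first qubit as control, the use of concurrence as the entanglement measure, and the conventions $F = |\langle\psi|\phi\rangle|$ and $\alpha = \arccos F$ are all assumptions made here for illustration; they are not taken from the simulations behind Fig. 4.

```python
import cmath
import math
from itertools import combinations


def make_test_states():
    """Enumerate the 28 test states as tuples of complex amplitudes.

    Amplitude order is (|00>, |01>, |10>, |11>).
    """
    states = []
    # Four computational basis states.
    for i in range(4):
        basis = [0j] * 4
        basis[i] = 1 + 0j
        states.append(tuple(basis))
    # Six pairs of basis states, each with four relative phases (k in Z_4).
    for i, j in combinations(range(4), 2):
        for k in range(4):
            psi = [0j] * 4
            psi[i] = complex(1 / math.sqrt(2))
            psi[j] = cmath.exp(1j * math.pi * k / 2) / math.sqrt(2)
            states.append(tuple(psi))
    return states


def ideal_cnot(psi):
    """Ideal CNOT, ASSUMING the first qubit is the control: swaps |10> and |11>."""
    return (psi[0], psi[1], psi[3], psi[2])


def concurrence(psi):
    """Concurrence of a pure 2-qubit state (0 = separable, 1 = maximally entangled)."""
    return 2 * abs(psi[0] * psi[3] - psi[1] * psi[2])


def fidelity(psi, phi):
    """Overlap |<psi|phi>| (assumed convention for F)."""
    return abs(sum(a.conjugate() * b for a, b in zip(psi, phi)))


def error_angle_deg(f):
    """Angle alpha = arccos(F) in degrees (assumed relation between F and alpha)."""
    return math.degrees(math.acos(min(1.0, f)))


if __name__ == "__main__":
    states = make_test_states()
    n_entangled = sum(1 for s in states if concurrence(s) > 1e-9)
    print(len(states), "test states,", n_entangled, "entangled")
    print("alpha for F = 0.97:", round(error_angle_deg(0.97), 2), "degrees")
```

Under the assumed relation $\alpha = \arccos F$, a fidelity of 0.97 corresponds to an angle of about 14 degrees, in line with the value quoted in the text. In this enumeration only the superpositions of $|00\rangle$ with $|11\rangle$ and of $|01\rangle$ with $|10\rangle$ are entangled; the remaining states are separable.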



5 Discussion 

We have studied the potential of an artificial neural net- 
work to operate on correlated spike trains assuming the 
latter to encode quantum states. The model neurons are reduced here to the essential ingredients of computational capability. A few comments concerning the simplifications made are in order at this point:

First, we have completely neglected the synaptic noise by assuming that signals are relayed undisturbed between cells. The justification is that here the few edges of each node represent averages over $10^3$ to $10^4$ real synaptic connections, so the impact of faulty transmission through a single synapse is greatly reduced. Including this likely source of errors may nevertheless significantly reduce the overall performance.
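
The averaging argument can be made quantitative with a minimal sketch, under the assumption (made here, not in the model itself) that each real synapse transmits a spike independently and with the same probability. The relative fluctuation of the averaged signal then falls as $1/\sqrt{N}$ with the number $N$ of synapses. The function name and the transmission probability of 0.5 are hypothetical.

```python
import math


def relative_sd_of_average(n_synapses, p_transmit):
    """Relative SD of the mean of n independent Bernoulli(p) transmissions.

    ASSUMPTION: synapses fail independently with identical probability,
    so the SD of the mean is sqrt(p * (1 - p) / n); dividing by p gives
    the fluctuation relative to the mean transmitted signal.
    """
    if n_synapses < 1:
        raise ValueError("n_synapses must be at least 1")
    if not 0.0 < p_transmit <= 1.0:
        raise ValueError("p_transmit must lie in (0, 1]")
    sd_of_mean = math.sqrt(p_transmit * (1.0 - p_transmit) / n_synapses)
    return sd_of_mean / p_transmit


if __name__ == "__main__":
    # Hypothetical transmission probability, chosen only for illustration.
    p = 0.5
    for n in (1, 10**3, 10**4):
        print(n, relative_sd_of_average(n, p))
```

With a transmission probability of 0.5, a single synapse fluctuates by 100% of its mean, whereas an average over $10^4$ synapses fluctuates by about 1%. Correlated failures, which are not modelled here, would reduce this benefit.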

Second, the duration of the processed signals is assumed to be much shorter than the time scale of synaptic plasticity. Adaptation is an inherent element of information processing in the brain, but it conflicts with the objective of reliable signal transformations, in that there is a trade-off between computing efficiency and adaptive capability. The resolution is provided by a separation in time scales between the two processes: transformations act over short signals, typically in response to rapidly varying external stimuli. This is consistent with the optimal signal length, which was found here for both 1- and 2-qubit gates to be of the order of 30 steps. Assuming the time step is set to about 5 ms, this leads to a realistic signal duration of about 150 ms.

Third, the detrimental effect of cellular noise on the performance of quantum gates clearly shows the deterministic regime to be preferable, at least for the coding scheme considered here. On the one hand, the sharp firing threshold needed for the neurons to act as 'counters', which discretize linearly accumulated input signals, agrees with the theoretical analysis of optimality in terms of information encoding [21]. On the other hand, the noise that blurs this threshold has been shown to be a viable resource acting through the mechanisms of stochastic resonance [22]. This suggests considering alternative quantum coding schemes which would make use of the inherent uncertainty in spike generation, provided the relevant conditions are stable enough (e.g., noise variance at a constant, moderate level).

It is worth noting at this point that quantum states are not absolute entities, and the same set of spike trains may be 'quantized' in many different ways, depending on the assumed definition of a state. Accordingly, the quantum transformations as well as their implementations will differ. We have discussed here only two coding schemes (referred to as the 'dense' and 'sparse' spatial codes), but it appears plausible that real networks may actually alternate between (or combine) many different encodings, depending on the nature of the input signal and the functional properties of the circuit.
An evident possibility is the sparse temporal code based on probability waves, which is particularly attractive for at least two reasons. First, brain waves provide the frequency basis necessary for phase discrimination, and there is experimental indication of independence between the rate and phase variables [23]. The question is not whether the oscillation of the spiking probability has a role, but rather what is the relevant number of modes involved in computation (if more than two, then one should consider qudits instead of just qubits). Second, while the 'dense' code requires two random binary variables per qubit, probability waves, by trading spatial for temporal resources, allow one qubit to be encoded per neuron. The drawback is that the mechanisms of short-term synaptic plasticity [18-20] make neural circuits operating on this form of code susceptible to unwanted modifications [30]. From this perspective, the use of sparse spatial coding [24-29] appears to be advantageous, since such spike trains have by definition no temporal correlations, and hence circuits operating in this fashion are expected to be more stable.

In summary, we have demonstrated the principle of employing quantum coding in artificial neural networks by providing examples of circuits which realize quantum gates. There is room for improvement and further investigation, with more realism put into the model, alternative circuits, and algorithm implementations. Exploring the possible ways in which neural networks can handle quantum codes can certainly benefit both quantum mechanics and neuroscience. On the one hand, applications of QM to neural systems broaden the range of possibilities to be considered when seeking to understand the language of spikes; on the other, macroscopic realizations can provide clues about the microscopic phenomena from which QM originated.